Collective Framework and Performance Optimizations to Open MPI for Cray XT Platforms

Authors

  • Joshua S. Ladd
  • Manjunath Gorentla Venkata
  • Pavel Shamis
  • Richard L. Graham
Abstract

The performance and scalability of collective operations play a key role in the performance and scalability of many scientific applications. Within the Open MPI code base we have developed a general-purpose hierarchical collective operations framework called Cheetah, and applied it at large scale on the Oak Ridge Leadership Computing Facility’s (OLCF) Jaguar platform, obtaining better performance and scalability than the native MPI implementation. This paper discusses Cheetah’s design and implementation, and optimizations to the framework for Cray XT5 platforms. Our results show that Cheetah’s Broadcast and Barrier perform better than the native MPI implementation. For medium data, Cheetah’s Broadcast outperforms the native MPI implementation by 93% at 49,152 processes. For small and large data, it outperforms the native MPI implementation by 10% and 9%, respectively, at 24,576 processes. Cheetah’s Barrier performs 10% better than the native MPI implementation at 12,288 processes.
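
The Broadcast and Barrier figures above are collective latency comparisons. As a point of reference, the sketch below shows the kind of minimal MPI microbenchmark commonly used for such measurements; it is not code from the paper, and the message size and iteration count are illustrative assumptions. Because the program only calls MPI_Bcast, the same benchmark exercises whichever collective implementation the MPI library selects at run time (e.g., a hierarchical framework such as Cheetah or the native collectives).

```c
/* Minimal broadcast-latency sketch (illustrative; not from the paper).
 * Build with an MPI compiler wrapper, e.g.: mpicc -O2 bcast_bench.c -o bcast_bench */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int msg_bytes  = 4096;  /* assumed "medium" message size */
    const int iterations = 1000;  /* assumed iteration count */
    int rank;
    char *buf;
    double start, elapsed;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(msg_bytes);

    /* Warm up, then synchronize all ranks before timing. */
    MPI_Bcast(buf, msg_bytes, MPI_CHAR, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);

    start = MPI_Wtime();
    for (int i = 0; i < iterations; i++)
        MPI_Bcast(buf, msg_bytes, MPI_CHAR, 0, MPI_COMM_WORLD);
    elapsed = MPI_Wtime() - start;

    if (rank == 0)
        printf("average MPI_Bcast latency: %.3f us\n",
               1.0e6 * elapsed / iterations);

    free(buf);
    MPI_Finalize();
    return 0;
}
```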

Similar Papers

An Evaluation of Open MPI's Matching Transport Layer on the Cray XT

Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently, Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the MPI library to manage message matching, which is sub-optimal for some networks that inherently support match...

A Comparison of Application Performance Using Open MPI and Cray MPI

Open MPI is the result of an active international Open-Source collaboration of Industry, National Laboratories, and Academia. This implementation is becoming the production MPI implementation at many sites, including some of DOE’s largest Linux production systems. This paper presents the results of a study comparing the application performance of VH-1, GTC, the Parallel Ocean Program, and S3D o...

Implementation of Open MPI on the Cray XT3

Open MPI provides a high-performance MPI-2 implementation for a wide variety of platforms. Open MPI has recently been ported to the Cray XT3 platform. This paper discusses the challenges of porting and describes important implementation decisions. A comparison of performance results between Open MPI and the Cray-supported implementation of MPICH2 is also presented.

Open MPI for Cray XE/XK Systems

Open MPI provides an implementation of the MPI standard supporting communication over a range of high-performance network interfaces. Recently, Oak Ridge National Laboratory (ORNL) and Los Alamos National Laboratory (LANL) collaborated on creating a port of Open MPI for Gemini, the network interface for Cray XE and XK systems. In this paper, we present our design and implementation of Open MPI’s...

Optimizing MPI Collectives for X1

Traditionally MPI collective operations have been based on point-to-point messages, with possible optimizations for system topologies and communication protocols. The Cray X1 scatter/gather hardware and shared memory mapping features allow for significantly different approaches to MPI collectives leading to substantial performance gains over standard methods, especially for short message length...

Publication date: 2011